Introduction to NeXL

Nicholas W. M. Ritchie 30-Apr-2023 (nicholas.ritchie@nist.gov)

What is NeXL?

NeXL is a collection of packages (libraries) for the Julia language that implement common X-ray microanalysis algorithms, physical data and utilities.

The core NeXL packages are

  • NeXLCore - Defines core concepts like elements, X-rays, atomic shells, k-ratios and fundamental electron and X-ray algorithms.

  • NeXLMatrixCorrection - Implements algorithms for performing matrix correction for bulk samples

  • NeXLSpectrum - Implements data types and algorithms to handle spectra and hyper-spectra

NeXL builds on many packages from the Julia infrastructure including

  • Gadfly - A "grammar of graphics" plotting package

  • DataFrames - "Pandas-like" tools for working with tabular data

Why Julia?

Julia is a high-performance scripting language designed for numerical data analysis. The base Julia libraries provide basic scalar, vector and matrix numerical analysis tools. Julia differs from most scriptable languages in that all code is compiled before execution. This leads to a "slow then fast" user experience. The first time a method is used, it and all the methods it depends upon must be compiled. This can lead to a significant delay. However, once compiled, the code is fast - often as fast as C or Fortran code. You will notice this "slow then fast" nature as you work with this notebook. It is particularly evident when creating plots.

NeXLCore

We will start by exploring NeXLCore since this package provides core functionality for the other NeXL packages.

println(VERSION)
Threads.nthreads()
1.9.0
18
using DrWatson
@quickactivate "IUMAS"

using NeXLCore

Elements

We will first construct a datum representing an element and then manipulate this object to extract elemental data.

Julia is not an object-oriented language. Instead it uses what is called "multiple dispatch" to determine which method implementation to call dependent on the arguments handed to the function.

el = n"Tc"  # Here we are using a special syntactic sugar to convert the string "Tc" into an Element struct
Technetium (Tc), number 43:
categorytransition metal
atomic mass98.0 u
density11.0 g/cm³
molar heat24.27 J/mol⋅K
melting point2430.0 K
boiling point4538.0 K
phaseSolid
shells[2, 8, 18, 13, 2]
electron configuration1s² 2s² 2p⁶ 3s² 3p⁶ 4s² 3d¹⁰ 4p⁶ 5s² 4d⁵
appearanceshiny gray metal
summaryTechnetium (/tɛkˈniːʃiəm/) is a chemical element with symbol Tc and atomic number 43. It is the element with the lowest atomic number in the periodic table that has no stable isotopes:every form of it is radioactive. Nearly all technetium is produced synthetically, and only minute amounts are found in nature.
discovered byEmilio Segrè
sourcehttps://en.wikipedia.org/wiki/Technetium

Let's get some properties of this elemental datum.

z(el), a(el), symbol(el), name(el)
(43, 98.0, "Tc", "Technetium")

We can combine elements into materials. The mat"..." macro converts expressions into Material structs.

mat = mat"Ca(PO4)3OH"
mat, typeof(mat)
(Ca(PO4)3OH[H=0.0029,O=0.6082,P=0.2717,Ca=0.1172], NeXLCore.Material{Float6
4, Float64})

An alternative syntax allows you to specify additional properties of the Material.

mat = parse(Material, "Ca(PO4)3OH", name = "Apatite", density = 2.2)
Apatite[H=0.0029,O=0.6082,P=0.2717,Ca=0.1172,2.20 g/cm³]

In addition to being able to parse chemical formulae, NeXLCore can also parse expressions like the next one. In this case, the multiplier represents the mass fraction of the element. The right hand side can also be a chemical formula as though the material was made from various mass fractions of consituent materials.

adm = parse(Material, "0.3399*O+0.0664*Al+0.0405*Si+0.0683*Ca+0.0713*Ti+0.1055*Zn+0.3037*Ge", name="ADM-6005a")
ADM-6005a[O=0.3399,Al=0.0664,Si=0.0405,Ca=0.0683,Ti=0.0713,Zn=0.1055,Ge=0.3
037]

In addition to elements and materials, NeXLCore provides mechanisms to work with characteristic X-rays, atomic shells and other atomic physics data. The n"..." macro can be used to create Element, CharXRay, AtomicSubShell and SubShell structs.

cxr = n"Tc K-L3"
cxr, typeof(cxr)
(Tc K-L3, NeXLCore.CharXRay)
ass = n"Tc K"
ass, typeof(ass)
(Tc K, NeXLCore.AtomicSubShell)

Let's get some properties of the CharXRay datum. You'll notice that the energy is in electron-volts as is the case for all energies within NeXL.

energy(cxr), inner(cxr), energy(inner(cxr)), outer(cxr), energy(outer(cxr)), typeof(inner(cxr))
(18367.0, Tc K, 21050.0, Tc L3, 2683.0, NeXLCore.AtomicSubShell)

To determine every characteristic X-ray generated by a particular element, you can use the characteristic(...). To generate sub-sets of the transitions replace alltransitions with ktransitions, ltransitions or mtransitions.

cxrs=characteristic(el, alltransitions)
cxrs, length(cxrs)
(Tc K-L3 + 9 others, Tc L3-M5 + 23 others & Tc M5-N3 + 9 others, 44)

Now we can use various functions to extract X-ray related physical data from elements and materials. The mac(...) function returns the mass absorption coeffficient in $cm^2/g$.

mac(mat, n"Ca K-L3"), mac(n"Ca", n"Ca K-L3"), mac(mat, 3692.0), mac(n"Ca", 3692.0)
(257.51384578832244, 139.20100028283625, 257.51384578832244, 139.2010002828
3625)

This is an example of one of Julia's core design features - multiple dispatch. The function mac(...) is called on four distinct argument types.

  • mac(Material, CharXRay)

  • mac(Element, CharXRay)

  • mac(Material, Float64)

  • mac(Element, Float64)

This is similar but extends on the way object oriented code can use the object type to determine which function implementation to call.

NeXL uses the Gadfly library to plot spectra and other data items.

using Gadfly
plot(e->mac(mat, e), 100.0, 1.0e4, Scale.y_log10)
plot(
    layer(e->mac(mat, e), 100.0, 1.0e4, Theme(default_color="Red")),
    layer(e->mac(n"Ca", e), 100.0, 1.0e4, Theme(default_color="Green")),
    layer(e->mac(n"P", e), 100.0, 1.0e4, Theme(default_color="Blue")),
    Scale.y_log10
)

There is much more functionality in NeXLCore, but that is enought to get us started. Next let's take a look at NeXLSpectrum.

NeXLSpectrum

NeXLSpectrum defines methods for reading, writing, manipulating, plotting and fitting X-ray spectra and X-ray spectrum images ("hyperspectra").

using NeXLSpectrum

First, we will read a spectrum from disk using the loadspectrum(...) method. It take a filename and is able to sniff the file type for many common spectrum file formats.

sp = loadspectrum(joinpath(datadir(), "exp_raw", "ADM6005a spectra","ADM-6005a_1.msa"))
*                                                                        
                          128860.0
  *                                                                        
                         
  *                                                                        
                         
  *                                                                        
                         
  *                                                                        
                         
  *                                                                        
                         
  *     *                                                                  
                         
  *     *                                                                  
                         
  *     *                                                                  
                         
  * ** **                                                                  
                         
  * ** *****        **  **                                                 
                         
***************************************************************************
************************* 20.0 keV
Spectrum{Float64}[ADM-6005a_1, -484.20818 + 5.01716⋅ch eV, 4096 ch, 20.0 ke
V, unknown composition, 6.81e6 counts]
set_default_plot_size(10inch, 4inch)
spec_files = [ joinpath(datadir(), "exp_raw", "ADM6005a spectra","ADM-6005a_$i.msa") for i in 1:15 ]
specs=loadspectrum.(spec_files)
plot(specs..., klms=[n"O", n"Al", n"Si", n"Ca", n"Ti", n"Zn", n"Ge"])
plot(specs..., klms=[n"O", n"Al", n"Si", n"Ca", n"Ti", n"Zn", n"Ge"], xmax=6.0e3)

To quantify this data, we need standard spectra. The standard spectra are preprocessed to optimize the fitting process. The spectrum files are read from disk and must contain the following properties :BeamEnergy, :LiveTime, :ProbeCurrent and :TakeOffAngle to be fit and quantified. It is possible to pre-load the standard spectra and edit the properties in the script if they are not present in the spectrum file.

path=joinpath(datadir(), "exp_raw", "ADM6005a spectra")
refs=references( [
        reference(n"O", joinpath(path, "SiO2 std.msa"), mat"SiO2"),
        reference(n"Si", joinpath(path, "SiO2 std.msa"), mat"SiO2"),
        reference(n"Al", joinpath(path, "Al std.msa"), mat"Al"),
        reference(n"Ca", joinpath(path, "CaF2 std.msa"), mat"CaF2"),
        reference(n"Ti", joinpath(path, "Ti trimmed.msa"), mat"Ti"),
        reference(n"Zn", joinpath(path, "Zn std.msa"), mat"Zn"),
        reference(n"Ge", joinpath(path, "Ge std.msa"), mat"Ge"),
    ], 132.0
)
using DataFrames
asa(DataFrame, refs)  # Describe the references in a DataFrame
12×9 DataFrame
RowSpectrumBeam Energy (keV)Probe Current (nA)Live Time (s)MaterialLinesROIP-to-BS-to-N
StringFloat64Float64Float64Material…Array…UnitRang…Float64Float64
1SiO220.05.0100.0SiO2[O=0.5326,Si=0.4674]O K-L3 + 1 other175:22821.87884119.94
2SiO220.05.0100.0SiO2[O=0.5326,Si=0.4674]Si K-L3 + 3 others409:49120.09388705.35
3Al20.05.0100.001Al[Al=1.0000]Al K-L3 + 3 others360:43319.853714014.8
4CaF220.02.5100.0CaF2[F=0.4867,Ca=0.5133]Ca K-L3 + 3 others788:93819.44665767.9
5Ti20.05.0100.001Ti[Ti=1.0000]Ti L3-M5 + 13 others153:2264.70571092.39
6Ti20.05.0100.001Ti[Ti=1.0000]Ti K-L3 + 3 others948:112522.53310719.4
7Zn20.05.0100.002Zn[Zn=1.0000]Zn L3-M5 + 13 others247:34817.61779149.2
8Zn20.05.0100.002Zn[Zn=1.0000]Zn K-L3 + 1 other1753:188213.614272.31
9Zn20.05.0100.002Zn[Zn=1.0000]Zn K-M3 + 3 others1945:20652.6177686.219
10Ge20.05.0100.001Ge[Ge=1.0000]Ge L3-M5 + 15 others277:39714.90059088.97
11Ge20.05.0100.001Ge[Ge=1.0000]Ge K-L3 + 1 other1997:213510.57293025.67
12Ge20.05.0100.001Ge[Ge=1.0000]Ge K-M3 + 5 others2223:23581.93878478.809

The references are then used to fit the unknown spectra and matrix correction is performed. The quantify(...) method can be used to perform both operations or the fit_spectrum(...) method can be used to fit the references to the unknowns to produce k-ratios and then the k-ratios fed to the quantify(...) method to perform matrix correction. The later two-step process is useful when you want to access the k-ratio data.

qrs=map(sp->quantify(sp, refs),specs)
qdf=asa(DataFrame, qrs, nominal=adm)
16×9 DataFrame
RowMaterialOAlSiCaTiZnGeTotal
StringAbstract…Abstract…Abstract…Abstract…Abstract…Abstract…Abstract…Abstract…
1ADM-6005a_10.3807590.06802230.03882770.06394990.07335950.1142640.3188521.058035e+00 ± 0.0e+00
2ADM-6005a_20.3796430.06789270.03890750.06393140.07289920.1140220.3195171.056813e+00 ± 0.0e+00
3ADM-6005a_30.3817520.06809040.03870230.06404760.07305050.1141830.3175111.057336e+00 ± 0.0e+00
4ADM-6005a_40.380670.0680380.03895730.06379970.0730060.1141540.318971.057595e+00 ± 0.0e+00
5ADM-6005a_50.3808510.06835360.03866460.06400860.07307130.1141850.3177271.056861e+00 ± 0.0e+00
6ADM-6005a_60.3813980.06828510.03881310.06369240.07297450.1137070.3171951.056065e+00 ± 0.0e+00
7ADM-6005a_70.3821090.06785880.03891130.06402670.07322650.1136660.3190451.058844e+00 ± 0.0e+00
8ADM-6005a_80.3811850.0682170.03903830.0640360.07311370.1138160.3194571.058862e+00 ± 0.0e+00
9ADM-6005a_90.3812490.06817750.03896370.06397980.07304210.1136190.3165891.055620e+00 ± 0.0e+00
10ADM-6005a_100.3803730.06830980.03893390.0637950.07283260.1142850.3180741.056603e+00 ± 0.0e+00
11ADM-6005a_110.3810090.06807630.03904290.06398640.07295920.1139970.3188051.057876e+00 ± 0.0e+00
12ADM-6005a_120.3817740.0679320.0389310.06392850.07275760.1139030.3178321.057057e+00 ± 0.0e+00
13ADM-6005a_130.3816880.06814770.03882430.06408710.07294260.1138030.3187031.058195e+00 ± 0.0e+00
14ADM-6005a_140.381680.06782880.03898960.06399930.07287090.1137780.3182761.057423e+00 ± 0.0e+00
15ADM-6005a_150.3823180.06829550.03879060.06404590.07310220.1142170.3186651.059434e+00 ± 0.0e+00
16ADM-6005a0.33990.06640.04050.06830.07130.10550.30370.9956

We can use the describe(...) method from DataFrames to generate summary statistics.

describe(qdf[1:end-1,2:end], :mean, :std, :min, :max)
8×5 DataFrame
Rowvariablemeanstdminmax
SymbolFloat64Union…Abstract…Abstract…
1O0.3812310.0007045080.3796430.382318
2Al0.06810170.0001724220.06782880.0683536
3Si0.03888660.0001132230.03866460.0390429
4Ca0.06395430.0001109570.06369240.0640871
5Ti0.07301390.0001534520.07275760.0733595
6Zn0.1139730.00023190.1136190.114285
7Ge0.3183480.0008457660.3165890.319517
8Total1.057511.055620e+00 ± 0.0e+001.059434e+00 ± 0.0e+00

Or we can take a closer look at the data from one spectrum.

asa(DataFrame, qrs[1])
7×14 DataFrame
RowLabelElementStandardXraysCΔCkΔkgZAFcgZAFc
StringStringStringStringFloat64Float64Float64?Float64?Float64?Float64?Float64?Float64?Float64?Float64?
1ADM-6005a_1TiTiTi K-L3 + 3 others0.073360.000120.06432760.0001032821.00.9256340.9473281.01.00.876879
2ADM-6005a_1AlAlAl K-L3 + 3 others0.0680220.000120.02807664.99883e-51.01.033960.3986851.00131.00.412758
3ADM-6005a_1GeGeGe K-L3 + 1 other0.318850.000640.2634790.0005254191.00.8314310.9938681.01.00.826333
4ADM-6005a_1CaCaF2Ca K-L3 + 3 others0.063958.9e-50.1213890.0001686611.01.043120.9265011.008221.00.974395
5ADM-6005a_1ZnZnZn K-L3 + 1 other0.114260.000280.1117120.0002724371.00.8793670.9995011.112331.00.977661
6ADM-6005a_1SiSiO2Si K-L3 + 3 others0.0388289.1e-50.05305520.0001241761.01.110120.5750321.000571.00.63872
7ADM-6005a_1OSiO2O K-L3 + 1 other0.380760.000530.4875580.0006728071.01.109080.6149520.9998681.00.681942

Associated with each spectrum are a collection of properties. These properties are often read from the spectrum file but can also be assigned manually.

NeXLCore.properties(sp)
Dict{Symbol, Any} with 11 entries:
  :Owner           => "Bruker Nano"
  :RealTime        => 112.108
  :BeamEnergy      => 20000.0
  :Title           => "Bruker Nano spectrum Acquisition"
  :Elevation       => 0.698132
  :LiveTime        => 100.001
  :Filename        => "C:\\Users\\nritchie\\Desktop\\IUMAS\\data\\exp_raw\\
ADM6…
  :ProbeCurrent    => 5.0
  :TakeOffAngle    => 0.698132
  :Name            => "ADM-6005a_1"
  :AcquisitionTime => DateTime("2019-07-12T09:33:00")

Properties are identified with a Symbol which is represented by a colon ':' plus a name. There are many predefined spectrum properties but you can also define your own to associate custom data with spectra.

sp[:BeamEnergy], sp[:ProbeCurrent], sp[:LiveTime], sp[:TakeOffAngle], dose(sp)
(20000.0, 5.0, 100.001, 0.6981317007977318, 500.005)

This is one way to index spectra to extract data from the spectrum. It is also possible to extract channel count data using various different types as indices. To demonstrate this, we will use Integer, AbstractFloat or CharXRay types to index the same channel in the spectrum - the channel associated with the Ca K-L3 peak.

channel(n"Ca K-L3", sp), channel(Float64, n"Ca K-L3", sp), channel(energy(n"Ca K-L3"), sp), channel(Float64, energy(n"Ca K-L3"), sp)
(833, 833.384891053903, 833, 833.384891053903)

In the first case, we index the channel directly by channel index. In the second case, we use a floating point number which is assumed to represent an energy to index the same channel. Finally, we use a CharXRay to index the same channel.

sp[833], sp[energy(n"Ca K-L3")], sp[n"Ca K-L3"]
(18042.0, 18042.0, 18042.0)

Often it is useful to be able to determine which of a collection of spectra collected from a single material under similar conditions are similar to one another. The notion being that the 'best' spectra are those that are most similar. These can then be summed together to produce a single spectrum that best represents the material. The findsimilar(...) method extracts the similar spectra from a list and removes outliers.

plot(sum(findsimilar(specs, refs.detector, n"O")), klms=[n"O", n"Al", n"Si", n"Ca", n"Ti", n"Zn", n"Ge"])

The similarity is determined by comparing each spectrum with the sum of the other spectra and calculating a score that respects count statistics. Similar spectra have a similarity(...) score of approximately unity.

DataFrame(Name=name.(specs), S1=NeXLSpectrum.similarity(specs, refs.detector, adm), S2=NeXLSpectrum.similarity(specs), Counts=integrate.(specs))
15×4 DataFrame
RowNameS1S2Counts
StringFloat64Float64Float64
1ADM-6005a_10.8614811.041116.81189e6
2ADM-6005a_20.8520620.9704976.81126e6
3ADM-6005a_30.8546750.999766.81006e6
4ADM-6005a_40.8542780.9802646.81625e6
5ADM-6005a_50.890410.9806026.80921e6
6ADM-6005a_60.8909280.9624676.8086e6
7ADM-6005a_70.8412130.9766286.81929e6
8ADM-6005a_80.7788260.9711666.81616e6
9ADM-6005a_90.838671.016526.816e6
10ADM-6005a_100.8473911.003536.81184e6
11ADM-6005a_110.8616670.9754456.81146e6
12ADM-6005a_120.8231051.014216.81786e6
13ADM-6005a_130.8143980.9742836.81113e6
14ADM-6005a_140.7775660.9723486.81502e6
15ADM-6005a_150.7807070.9424426.81972e6

HyperSpectra with NeXLSpectrum

A HyperSpectrum represents a multi-dimensional array of point spectra. NeXLSpectrum can handle a arbitrary number of dimensions like 1-D (linescans), 2-D (area scans) or higher dimensions.

There are currently three ways to load a hyperspectrum.

  • From a Lispix-style RPL/RAW file pair

  • From a SEMantics-style PTZ file

  • From a HyperSpy-style HDF5 file

We will use the DataDeps library to download the hyperspectrum data on demand from the NIST data website. You will need to be connected to the Internet.

# DataDeps is a utility for downloading data on demand from a URL
using DataDeps

# Where do I want to put the data?
ENV["DATADEPS_LOAD_PATH"] = joinpath(datadir(), "exp_raw")
ENV["DATADEPS_ALWAYS_ACCEPT"]="true"

# Register the data sets using the names "MnDeepSeaNodule" and "MnDeepSeaNoduleStandards"
register(DataDep("MnDeepSeaNodule",
    """
    Dataset:       Deep Sea Manganese Nodule
    Acquired by:   Nicholas W.M. Ritchie
    License:       CC-SA 3.0
    """,
    raw"https://data.nist.gov/od/ds/mds2-2467/MnNodule.tar.gz",
    "5b5b6623b8f4daca3ff3073708442ac5702ff690aa12668659875ec5642b458d",
    post_fetch_method=DataDeps.unpack
));

register(DataDep("MnDeepSeaNoduleStandards",
    """
    Dataset:       Standard Spectra for Deep Sea Manganese Nodule Dataset
    Acquired by:   Nicholas W.M. Ritchie
    License:       CC-SA 3.0
    """,
    raw"https://data.nist.gov/od/ds/mds2-2467/MnNodule_Standards.tar.gz",
    "69283ba72146932ba451e679cf02fbd6b350f96f6d012d50f589ed9dd2e35f1a",
    post_fetch_method=DataDeps.unpack
));

The data is in RPL/RAW format - a format designed by David Bright at NIST and available from many EDS vendor's software. In addition to the data in the RPL/RAW format, we also need to provide spectrum meta-data and the detector energy scale.

raw = readrplraw(joinpath(datadep"MnDeepSeaNodule","map[15]"))
les = LinearEnergyScale(0.0, 10.0)
props = Dict{Symbol,Any}(
    :LiveTime => 0.72*4.0*18.0*3600.0/(1024.0*1024.0),
    :BeamEnergy => 20.0e3,
    :TakeOffAngle => deg2rad(35.0),
    :ProbeCurrent => 1.0,
    :Name => "Deep Sea Mn Nodule",
)
hs = HyperSpectrum(les, props, raw, fov = (0.2, 0.2))

1024 × 1024 HyperSpectrum{UInt16,(:Y, :X)}[Deep Sea Mn Nodule), 0.0 + 10.0⋅ch eV, 2048 ch]

It is easy to generate crude elemental maps from HyperSpectrum structs. These are simple ROI maps constructed by summing a range of channels around the central channel for the specified X-ray transition.

hs[n"Mn K-L3"]

Or I can create a RGB-colorized image where R represents an ROI around the first chararacteristic X-ray, green the second and blue the third.

hs[n"Mn K-L3", n"Si K-L3", n"O K-L3"]

Much like spectra, hyperspectra have properties. However, the same set of properties apply to all spectra within a hyperspectrum. There is one exception, :LiveTime, which can be an array of the same dimension as the hyperspectrum to represent a unique live-time per pixel.

NeXLCore.properties(hs)
Dict{Symbol, Any} with 5 entries:
  :Name         => "Deep Sea Mn Nodule"
  :BeamEnergy   => 20000.0
  :LiveTime     => 0.177979
  :TakeOffAngle => 0.610865
  :ProbeCurrent => 1.0

To speed up processing bin the data using a block size > 1. 4 will provide 16x faster processing but lower resolution results.

# This speeds up the processing by binning a 1024 x 1024 hyperspectrum to 1024/block_size x 1024/block_size.
block_size = 1
hs = block_size > 1 ? block(hs, (block_size, block_size)) : hs

1024 × 1024 HyperSpectrum{UInt16,(:Y, :X)}[Deep Sea Mn Nodule), 0.0 + 10.0⋅ch eV, 2048 ch]

Unfortunately, ROI maps suffer from many shortcomings including

  • not being background corrected to eliminate the continuum signal

  • not being scaled for differences in generation efficiency between different characteristic X-rays.

While they are convenient as a first glimse at the data, they can be very deceptive and should rarely be included in reports.

Instead, it is better to fit the point spectra within the hyperspectrum using measured references.

hs_elms = [ n"Ag", n"Al", n"Ba", n"C", n"Ca", n"Ce", n"Cr", n"Cu", n"Fe", n"S", n"P", n"K", n"Mg", n"O", n"Mn", n"Na", n"Cl", n"Ni", n"Si", n"Ti", n"Zn" ]
plot(maxpixel(hs), klms = hs_elms, xmax=10.0e3)
print(symbol.(sort(hs_elms)))
["C", "O", "Na", "Mg", "Al", "Si", "P", "S", "Cl", "K", "Ca", "Ti", "Cr", "
Mn", "Fe", "Ni", "Cu", "Zn", "Ag", "Ba", "Ce"]
refs = references( [
        reference(n"C", joinpath(datadep"MnDeepSeaNoduleStandards", "C std.msa")),
        reference(n"O", joinpath(datadep"MnDeepSeaNoduleStandards", "MgO std.msa")),
        reference(n"Na", joinpath(datadep"MnDeepSeaNoduleStandards", "NaCl std.msa")),
        reference(n"Mg", joinpath(datadep"MnDeepSeaNoduleStandards", "Mg std.msa")),
        reference(n"Al", joinpath(datadep"MnDeepSeaNoduleStandards", "Al std.msa")),
        reference(n"Si", joinpath(datadep"MnDeepSeaNoduleStandards", "Si std.msa")),
        reference(n"P", joinpath(datadep"MnDeepSeaNoduleStandards", "GaP std.msa")), 
        reference(n"S", joinpath(datadep"MnDeepSeaNoduleStandards", "FeS2 std.msa")), 
        reference(n"Cl", joinpath(datadep"MnDeepSeaNoduleStandards", "NaCl std.msa")), 
        reference(n"K", joinpath(datadep"MnDeepSeaNoduleStandards", "KBr std.msa")), 
        reference(n"Ca", joinpath(datadep"MnDeepSeaNoduleStandards", "CaF2 std.msa")), 
        reference(n"Ti", joinpath(datadep"MnDeepSeaNoduleStandards", "Ti std.msa")), 
        reference(n"Cr", joinpath(datadep"MnDeepSeaNoduleStandards", "Cr std.msa")), 
        reference(n"Mn", joinpath(datadep"MnDeepSeaNoduleStandards", "Mn std.msa")), 
        reference(n"Fe", joinpath(datadep"MnDeepSeaNoduleStandards", "Fe std.msa")), 
        reference(n"Ni", joinpath(datadep"MnDeepSeaNoduleStandards", "Ni std.msa")), 
        reference(n"Cu", joinpath(datadep"MnDeepSeaNoduleStandards", "Cu std.msa")), 
        reference(n"Zn", joinpath(datadep"MnDeepSeaNoduleStandards", "Zn std.msa")), 
        reference(n"Ag", joinpath(datadep"MnDeepSeaNoduleStandards", "Ag std.msa")), 
        reference(n"Ba", joinpath(datadep"MnDeepSeaNoduleStandards", "BaF2 std.msa"))     
    ], 132.0)

asa(DataFrame, refs)
33×9 DataFrame
8 rows omitted
RowSpectrumBeam Energy (keV)Probe Current (nA)Live Time (s)MaterialLinesROIP-to-BS-to-N
SubStrin…Float64Float64Float64Material…Array…UnitRang…Float64Float64
1C std20.00.973721184.6C[C=1.0000,1.90 g/cm³]C K-L3 + 1 other16:4024.14315279.7
2MgO std20.00.97481161.48MgO[O=0.3970,Mg=0.6030,5.00 g/cm³]O K-L3 + 1 other39:6612.78898706.54
3NaCl std20.00.974541156.51NaCl[Na=0.3934,Cl=0.6066,5.00 g/cm³]Na K-L3 + 1 other89:1206.218399555.21
4Mg std20.00.974751149.05Mg[Mg=1.0000,1.74 g/cm³]Mg K-L3 + 1 other110:14217.447134147.0
5Al std20.00.973961151.25Al[Al=1.0000,1.00 g/cm³]Al K-L3 + 3 others132:16919.638936840.4
6Si std20.00.974651149.58Si[Si=1.0000,2.95 g/cm³]Si K-L3 + 3 others157:19825.039340461.5
7GaP std20.00.974511161.35GaP[P=0.3076,Ga=0.6924,5.00 g/cm³]P K-L3 + 3 others184:2308.271729393.98
8FeS2 std20.00.973341160.73FeS2[S=0.5345,Fe=0.4655,5.00 g/cm³]S K-L3 + 3 others212:26416.838821532.6
9NaCl std20.00.974541156.51NaCl[Na=0.3934,Cl=0.6066,5.00 g/cm³]Cl K-L3 + 3 others243:30023.51225791.1
10KBr std20.00.973921147.26KBr[K=0.3286,Br=0.6714,5.00 g/cm³]K K-L3 + 3 others310:3797.045759501.98
11CaF2 std20.00.974011167.54CaF2[F=0.4867,Ca=0.5133,3.18 g/cm³]Ca K-L3 + 3 others347:42217.527419125.9
12Ti std20.00.97521160.26Ti[Ti=1.0000,5.00 g/cm³]Ti L3-M5 + 13 others28:654.350943026.39
13Ti std20.00.97521160.26Ti[Ti=1.0000,5.00 g/cm³]Ti K-L3 + 3 others427:51620.563626561.6
22Fe std20.00.974361164.66Fe[Fe=1.0000,7.87 g/cm³]Fe K-M3 + 3 others680:7323.778543010.53
23Ni std20.00.972581164.08Ni[Ni=1.0000,5.00 g/cm³]Ni L3-M5 + 13 others62:10810.005211386.8
24Ni std20.00.972581164.08Ni[Ni=1.0000,5.00 g/cm³]Ni K-L3 + 1 other718:77816.041314413.1
25Ni std20.00.972581164.08Ni[Ni=1.0000,5.00 g/cm³]Ni K-M3 + 3 others799:8553.183282401.56
26Cu std20.00.972421159.85Cu[Cu=1.0000,8.90 g/cm³]Cu L3-M5 + 13 others69:11715.370319251.1
27Cu std20.00.972421159.85Cu[Cu=1.0000,8.90 g/cm³]Cu K-L3 + 1 other773:83614.093612296.1
28Cu std20.00.972421159.85Cu[Cu=1.0000,8.90 g/cm³]Cu K-M3 + 3 others862:9202.930672123.89
29Zn std20.00.972921157.03Zn[Zn=1.0000,5.00 g/cm³]Zn L3-M5 + 13 others76:12617.349723748.8
30Zn std20.00.972921157.03Zn[Zn=1.0000,5.00 g/cm³]Zn K-L3 + 1 other831:89612.506910519.0
31Zn std20.00.972921157.03Zn[Zn=1.0000,5.00 g/cm³]Zn K-M3 + 3 others928:9882.675581862.94
32Ag std20.00.98647921.028Ag[Ag=1.0000,5.00 g/cm³]Ag L3-M5 + 25 others247:3935.2894511615.2
33BaF2 std20.00.989861171.3BaF2[F=0.2167,Ba=0.7833,5.00 g/cm³]Ba L3-M5 + 29 others377:6172.854567169.28

Now we will fit the references to the hyperspectrum unknown. This is simple enough:

krs = fit_spectrum(hs, refs, mode = :Fast)

This is time consuming so once we've done it, we want to store the result in a file and reload the result rather than recalculate it. DrWatsons provides a mechanism to do this based on a function produce_or_load(....). You specify where to put the data and it builds a filename based on the prefix, suffix and the configuration paramters in @dict(mode, x_dim, y_dim). The suffix specifies that the data is written using the JLD2 library in HDF5 format.

mode = :Full  # Single threaded took 92 seconds for :Fast or 6000 seconds for :Full (many cores take about 1/6 this.)

# krs = @time fit_spectrum(hs, refs, mode = mode)  

using FileIO, Parameters
x_dim, y_dim = size(hs)
data, file = produce_or_load(
        joinpath(datadir(),"exp_pro"), # Path
        @dict(mode, x_dim, y_dim),     # Configuration
        prefix="kratios", suffix="jld2" # Filename parameters
    ) do config
        Dict("kratios" => @time fit_spectrum(hs, refs, mode = mode, sigma=3.0)) # Zero k-ratios less that 3σ
end
@show file
krs = data["kratios"]
1219.801389 seconds (2.39 G allocations: 2.918 TiB, 44.46% gc time, 0.01% c
ompilation time)
file = "C:\\Users\\nritchie\\Desktop\\IUMAS\\data\\exp_pro\\kratios_mode=Fu
ll_x_dim=1024_y_dim=1024.jld2"
33-element Vector{NeXLCore.KRatios}:
 k[C, C K-L3 + 1 other] = Float32[ (1024, 1024) ]
 k[MgO, O K-L3 + 1 other] = Float32[ (1024, 1024) ]
 k[NaCl, Na K-L3 + 1 other] = Float32[ (1024, 1024) ]
 k[Mg, Mg K-L3 + 1 other] = Float32[ (1024, 1024) ]
 k[Al, Al K-L3 + 3 others] = Float32[ (1024, 1024) ]
 k[Si, Si K-L3 + 3 others] = Float32[ (1024, 1024) ]
 k[GaP, P K-L3 + 3 others] = Float32[ (1024, 1024) ]
 k[FeS2, S K-L3 + 3 others] = Float32[ (1024, 1024) ]
 k[NaCl, Cl K-L3 + 3 others] = Float32[ (1024, 1024) ]
 k[KBr, K K-L3 + 3 others] = Float32[ (1024, 1024) ]
 ⋮
 k[Ni, Ni K-M3 + 3 others] = Float32[ (1024, 1024) ]
 k[Cu, Cu L3-M5 + 13 others] = Float32[ (1024, 1024) ]
 k[Cu, Cu K-L3 + 1 other] = Float32[ (1024, 1024) ]
 k[Cu, Cu K-M3 + 3 others] = Float32[ (1024, 1024) ]
 k[Zn, Zn L3-M5 + 13 others] = Float32[ (1024, 1024) ]
 k[Zn, Zn K-L3 + 1 other] = Float32[ (1024, 1024) ]
 k[Zn, Zn K-M3 + 3 others] = Float32[ (1024, 1024) ]
 k[Ag, Ag L3-M5 + 25 others] = Float32[ (1024, 1024) ]
 k[BaF2, Ba L3-M5 + 29 others] = Float32[ (1024, 1024) ]

The result of fit_spectrum(...) is a set k-ratio maps. Which can be readily visualized and represent a more quantitative perspective on the spectrum data. The principle advantage being that the data is background corrected so that continuum is not mistaken for a small quantity of an element.

To visualize the k-ratios, it is best to pick one k-ratio per element using optimizeks(...) and then normalize the k-ratios on a point-by-point basis using normalizek(...). This ensures that all the k-ratios are between 0 and 1. The function asa(ElementalMap, ...) facilitates this process by converting a Vector{KRatios} into a dictionary of images indexed by Element instances.

using Colors
imgs = asa(ElementalMap, krs, Gray)
imgs[n"Mn"]

We can perform matrix correction on the Vector{KRatios} using the quantify(...) method. It returns a Materials struct which is a memory efficient way of representing a multi-dimensional array of Material.

# quant=quantify(krs, name=hs[:Name], ty=Float32) # Takes ~80 minute for 1024 x 1024 (~5 ms per pixel)
data, file = produce_or_load(
        joinpath(datadir(),"exp_pro"),
        @dict(mode, x_dim, y_dim),
        prefix="quantify", suffix="jld2"
    ) do config
        q = @time quantify(krs, name=hs[:Name], ty=Float32)
        Dict("mass_fractions" => q)
end
@show file
quant = data["mass_fractions"]
1123.181981 seconds (28.98 G allocations: 2.967 TiB, 37.61% gc time, 0.00% 
compilation time)
file = "C:\\Users\\nritchie\\Desktop\\IUMAS\\data\\exp_pro\\quantify_mode=F
ull_x_dim=1024_y_dim=1024.jld2"
Materials[Deep Sea Mn Nodule, 1024 × 1024 × (C, O, Na, Mg, Al, Si, P, S, Cl, K, Ca, Ti, Cr, Mn, Fe, Ni, Cu, Zn, Ag, Ba)]

The Materials struct can be indexed at a specific coordinate to extract the composition at that coordinate as a Material.

ci = CartesianIndex(64 ÷ block_size,192 ÷ block_size)
plot(hs[ci], klms=quant[ci], xmax=10.0e3)
quant[ci]
Deep Sea Mn Nodule[64,192][C=0.1042,O=0.4443,Mg=0.0220,Al=0.0213,Si=0.0229,
Ca=0.0171,Mn=0.3364,Fe=0.0336,Ni=0.0233]

Or we can index the results using an Element in which case, we pull out a slice representing the mass fractions associated with that element.

t=quant[n"Mn"]
1024×1024 Matrix{Float32}:
 0.324086   0.311999   0.26446    …  0.190166   0.180911   0.182104
 0.271018   0.278042   0.221511      0.186287   0.18788    0.306695
 0.317517   0.0619518  0.244346      0.19922    0.213287   0.195424
 0.269474   0.249094   0.25204       0.221867   0.0691624  0.228209
 0.319719   0.249282   0.231007      0.17811    0.203394   0.199401
 0.256126   0.230807   0.23199    …  0.208834   0.232772   0.218939
 0.319657   0.193909   0.197841      0.353014   0.211844   0.216782
 0.218877   0.227499   0.229361      0.213233   0.228835   0.21699
 0.264534   0.251587   0.274179      0.194123   0.214168   0.214138
 0.262301   0.261231   0.240386      0.175014   0.185486   0.209481
 ⋮                                ⋱                        
 0.202788   0.161175   0.0921997  …  0.0        0.0        0.0
 0.148154   0.0981963  0.0627945     0.0        0.0        0.318118
 0.0944232  0.0660169  0.0689612     0.0        0.0        0.0
 0.101604   0.0735786  0.26724       0.0264943  0.0172222  0.0
 0.131552   0.12927    0.276316      0.0        0.0272936  0.0
 0.156501   0.208045   0.274624   …  0.0466484  0.0424684  0.0
 0.224822   0.23659    0.317166      0.0245856  0.0386291  0.0
 0.204602   0.293731   0.258414      0.0605041  0.0        0.0143856
 0.220293   0.303754   0.318462      0.0685915  0.0363392  0.0131115

Using broadcast, we can apply Material functions to Materials objects. The results is an Array with the same shape but with the result contents.

anal_tot=analyticaltotal.(quant)
1024×1024 Matrix{Float32}:
 1.04277   0.865111  0.966455  0.897278  …  0.739996  0.723382  0.673297
 0.774454  0.873582  0.687447  0.798705     0.735997  0.673719  0.875573
 0.925575  1.07874   0.872894  0.73056      0.698462  0.718494  0.740097
 0.770115  0.642361  0.813358  0.718284     0.628025  1.04408   0.651347
 0.855584  0.774087  0.670479  0.721183     0.454163  0.474307  0.520667
 0.82876   0.636944  0.810042  1.02239   …  0.593297  0.627422  0.50368
 0.765684  0.936689  0.779919  0.861888     0.81624   0.614459  0.627214
 0.706744  0.673572  0.836856  0.880228     0.736723  0.671359  0.589242
 0.878403  0.728341  0.837826  0.887324     0.710949  0.714305  0.610149
 0.865996  0.868738  0.883528  0.815937     0.862585  0.766835  0.655634
 ⋮                                       ⋱                      
 0.961802  0.88182   0.811786  1.0086    …  0.88926   0.953048  0.895114
 0.989449  0.992564  0.987073  0.972757     0.96085   1.02326   0.958229
 0.995291  0.990919  0.930657  0.88819      0.768663  0.843435  0.985657
 1.06438   1.09124   0.973773  0.917523     0.5491    0.694485  0.804928
 1.07485   0.96417   1.02073   0.979598     0.342118  0.549455  0.654145
 0.93915   1.05525   0.907607  0.923722  …  0.475692  0.534131  0.78958
 1.02361   0.958494  0.893507  0.756236     0.70853   0.711719  0.907362
 0.736653  0.880733  0.710251  0.838353     0.703628  0.87104   0.936011
 0.933375  0.954268  0.91374   0.927801     0.743118  0.851136  0.888548
using Statistics
mean(anal_tot), std(anal_tot)
(0.9605031f0, 0.18117538f0)

To convert the mass-fractions to images, we need to ensure all mass-fractions are on the range [0,1]. We can do this by normalizing the individual points to an analytical total of unity.

norm_quant=asnormalized(quant)
Materials[N[Deep Sea Mn Nodule], 1024 × 1024 × (C, O, Na, Mg, Al, Si, P, S, Cl, K, Ca, Ti, Cr, Mn, Fe, Ni, Cu, Zn, Ag, Ba)]

Now let's visualize this. In Julia, it is trivial to convert matrices representing values on the range [0,1] to images.

We will extract the plane of data corresponding to an element by indexing it with an Element. Log3Band is a function that converts numbers between 0 and 1 to RGB values. The mapping is displayed below in the legend.

Log3Band.(norm_quant[n"Ni"])
loadlegend("Log3BandBright.png")

The Ni elemental map can be seen to represent primarily values ranging from a few percent down to zero. Albeit, this spectrum image took 18 hours to collect but the depth of information might surprise many who are used to thinking of spectrum images as a crude tool.

Let's output various perspectives from each element to the 'plots' directory. The Gray function converts values on the range [0,1] to gray-scale values.

# Fast k-ratio maps
imgs = asa(ElementalMap, krs, Log3Band) 
for (el, img) in imgs
    save(joinpath(plotsdir(),"$(symbol(el)) - $mode Log3Band.png"), img)
end
# Full k-ratio maps
for (el, img) in asa(ElementalMap, krs, Gray) 
    save(joinpath(plotsdir(),"$(symbol(el)) - $mode Gray.png"), img)
end
# Full quantitative maps - Fast fit
for el in keys(norm_quant)
    pl = norm_quant[el]
    save(joinpath(plotsdir(),"$(symbol(el)) - Quant $mode Gray.png"), Gray.(pl))
    save(joinpath(plotsdir(),"$(symbol(el)) - Quant $mode Log3Band.png"), Log3Band.(pl))
end

Let's compare the results from the full fit with the results from the fast fit.

els = sort([ keys(imgs)...])
print(symbol.(els))
["C", "O", "Na", "Mg", "Al", "Si", "P", "S", "Cl", "K", "Ca", "Ti", "Cr", "
Mn", "Fe", "Ni", "Cu", "Zn", "Ag", "Ba"]

Plot the k-ratio images.

map( el->imgs[el], els)
(a vector displayed as a row to save space)

Plot the quantitative maps.

map(el->Log3Band.(norm_quant[el]), els)
(a vector displayed as a row to save space)

Clearly, the light element maps are where we see the influence of matrix correction. When X-ray absorption is small, the k-ratio is generally a good approximation of the mass-fraction. However, for low energy X-rays, the absorption can be strong leading to a significant underestimation of the mass fraction. We see this particularly in the carbon data.

To explore the data, let's plot histograms of the mass fractions from each element.

colors = distinguishable_colors(length(els), colorant"light blue", lchoices=range(50, stop=100, length=15))
    
plot(
    ( layer(x=quant[el], Geom.histogram, Theme(default_color=colors[i]), alpha=[0.6]) for (i, el) in enumerate(els) )...,
    Guide.xlabel("Mass Fraction"), Guide.ylabel("Counts"), 
    Guide.manual_color_key("Element", symbol.(els), colors),
    Coord.cartesian(xmin=0.0, xmax=1.0, ymax=16.0e4/(block_size^2))
)
plot(
    ( layer(x=quant[el], Geom.histogram, Theme(default_color=colors[i]), alpha=[0.6]) for (i, el) in enumerate(els) )...,
    Guide.xlabel("Mass Fraction"), Guide.ylabel("Counts"), 
    Guide.manual_color_key("Element", symbol.(els), colors),
    Coord.cartesian(xmin=0.0, xmax=0.05, ymax=16.0e4/block_size^2)
)

We can easily query the quantified data using broadcast syntax to create BitMatrix based on a boolean comparison. To determine which pixels have a mass-fraction of Si > 5% we create a mask. In this BitMatrix mask, 1 represents a true comparison and 0 a false comparison.

mask_si = quant[n"Si"] .> 0.05
1024×1024 BitMatrix:
 0  0  0  1  1  1  1  1  1  0  0  0  0  …  0  0  0  0  0  0  0  0  1  1  0 
 0
 0  0  0  0  0  0  1  0  0  0  0  0  0     0  0  0  0  0  0  0  0  0  1  0 
 0
 0  1  0  0  0  0  0  0  0  0  0  0  0     0  0  0  0  0  0  0  0  0  1  0 
 0
 0  0  0  0  1  0  0  0  0  0  0  0  0     0  0  0  0  0  1  0  0  0  0  1 
 0
 0  0  0  0  0  0  0  0  0  0  0  0  0     0  0  0  0  0  0  1  0  0  0  0 
 0
 0  0  0  1  0  0  0  0  0  0  1  0  0  …  0  0  0  0  0  0  0  0  0  0  0 
 0
 0  1  1  0  0  0  0  0  0  0  0  0  0     0  0  0  0  0  0  0  0  0  0  0 
 0
 0  1  0  0  0  0  0  0  0  0  0  1  1     0  0  0  0  0  0  0  0  0  0  0 
 0
 0  0  0  0  1  0  0  0  0  0  1  1  1     0  0  0  0  0  0  0  0  0  0  0 
 0
 0  0  1  1  0  0  0  0  0  0  1  1  1     0  0  0  0  1  0  0  1  1  1  0 
 0
 ⋮              ⋮              ⋮        ⋱           ⋮              ⋮       
 
 1  1  1  1  1  1  1  1  1  0  1  1  0  …  1  0  1  1  1  1  1  1  1  1  1 
 1
 1  1  1  1  1  1  1  1  1  1  1  1  0     1  1  1  1  1  1  0  1  1  1  1 
 0
 1  1  1  1  1  1  0  0  1  0  1  1  0     1  0  1  1  1  1  0  1  0  1  0 
 0
 1  1  0  1  1  0  0  0  0  1  1  1  0     1  1  1  1  1  1  1  1  0  0  0 
 0
 1  1  1  0  0  0  0  0  0  0  1  0  0     1  1  1  1  1  0  1  1  0  0  0 
 0
 0  0  0  0  0  0  0  0  0  0  0  0  0  …  1  0  0  1  0  0  1  1  0  0  0 
 1
 0  0  0  0  0  0  0  0  0  0  0  0  0     0  1  0  1  1  1  1  1  0  1  1 
 1
 0  0  0  0  0  0  0  0  0  0  0  0  0     0  0  0  1  1  1  1  0  1  1  1 
 1
 0  0  0  1  1  1  1  0  0  0  0  0  0     0  0  0  0  1  1  0  0  1  1  1 
 1

Using similar syntax we can determine how many pixels have at least 50%, 10%, 5% and 1% of an element?

DataFrame(
    Element=symbol.(els), 
    P50=map(el->count(quant[el] .> 0.5),els), 
    P10=map(el->count(quant[el] .> 0.1),els), 
    P5=map(el->count(quant[el] .> 0.05),els), 
    P1=map(el->count(quant[el] .> 0.01),els)
)
20×5 DataFrame
RowElementP50P10P5P1
StringInt64Int64Int64Int64
1C79013665407931411932032
2O27563103270810383231041335
3Na0285044217578
4Mg099895710823
5Al0504737633727059
6Si7120119268459927187
7P01966202307
8S0230114177
9Cl00123133
10K04853860268769
11Ca011575687808045
12Ti01414764639
13Cr002124
14Mn132863530917155961104
15Fe39182313505717800840
16Ni0613236499
17Cu071116724
18Zn275213503
19Ag0110708
20Ba019210875070

A mask can then be applied to extract and sum the requisite spectra within the hyperspectrum.

plot( sum(hs, quant[n"Si"] .> 0.05), klms=els, xmax=10.0e3)

Let's do something similar for Cu > 10%.

plot( sum(hs, quant[n"Cu"] .> 0.1), klms=els, xmax=10.0e3)

And silver...

plot( sum(hs, quant[n"Ag"] .> 0.05), klms=els, xmax=10.0e3)

Finally, to visualize the relative abundances and locations of the elements in the quantified data, we can construct a RGB image.

using ImageCore
colorview(RGB, norm_quant[n"Mn"], norm_quant[n"Si"], norm_quant[n"O"])

Compare this with the ROI map from earlier.

hs[n"Mn K-L3", n"Si K-L3", n"O K-L3"]

This is just a flavor of what you can do with spectra and hyperspectra using Julia, the Julia library infrastructure including the NeXL libraries. There are many machine learning, chemometric, image analysis and other libraries that can be readily applied.